Volumetric Semantic Segmentation using Pyramid Context Features: Supplemental Material

Authors

  • Jonathan T. Barron
  • Pablo Arbeláez
  • Soile V. E. Keränen
  • Mark D. Biggin
  • David W. Knowles
  • Jitendra Malik
Abstract

Our “pyramid filtering” insight enables exact and extremely efficient per-voxel classification with minimal memory overhead. We will now demonstrate our improvement over existing techniques empirically and theoretically.

For an empirical demonstration of efficiency, consider the following two alternatives to pyramid filtering. 1) The “sliding window” approach: iterate through every voxel in the input volume, and for each voxel construct and classify a feature vector. 2) The “FFT” approach: because our features and classifier are both linear, they can be reduced to a single linear operation, which amounts to filtering the volume with an extremely large filter, one with a support as large as the input volume. Such filtering is intractably expensive in the spatial domain, but is much more efficient in the Fourier domain: we can compute the FFT of a volume and a filter, do a component-wise multiplication in Fourier space, and then invert the FFT. When there are multiple feature channels, this can be sped up by summing the filter responses in Fourier space and inverting the FFT only once. The “FFT” approach can be sped up further by assuming that the Fourier-domain filters have been precomputed, which reduces computational demands but dramatically increases memory overhead. We will refer to this precomputed Fourier technique as “FFT+caching”.

The sliding window approach is extremely common in object detection [5], and the FFT approach has also been explored previously [6]. There are also many specialized ways to efficiently approximate these sliding-window-type filtering operations [7, 8], but we will only consider general and exact techniques. In Figure 1 we compare our pyramid filtering technique against the three previously described baselines. We see that only our technique performs well in terms of both speed and memory overhead when n is large, which is crucial, as n ≈ 256 in our experiments. The sliding window technique has minimal memory overhead, but is intractably slow.

Figure 1. Profiles of different methods for densely evaluating a linear classifier on a feature vector for every voxel in a volume (where we have 50 channels for the volume, as is common in our experiments). Here we show the speed (time taken to classify the entire volume) and memory overhead (memory required to classify the volume, not including the feature channels of the volume themselves) for different classification techniques and different volume sizes. Only our pyramid filtering technique is both fast and memory-efficient when n is large (n ≈ 256 in our experiments). In fact, pyramid filtering is the only method which actually runs to completion when n is large; all other techniques run out of memory or never finish, which is why our plots appear incomplete. Experiments were performed on a 2011 MacBook Pro.
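The two baselines described above can be sketched in NumPy to make the equivalence concrete. This is only an illustrative sketch, not the authors' implementation and not pyramid filtering itself: the function names, the cubic volume shape `(channels, n, n, n)`, the small filter size, and the use of circular (wrap-around) boundaries are all assumptions chosen so that the two results match exactly.

```python
import numpy as np

def sliding_window_scores(volume, filters):
    """Hypothetical baseline: score every voxel with an explicit dot product.

    volume:  (c, n, n, n) feature channels (assumed cubic, wrap-around borders)
    filters: (c, k, k, k) linear classifier weights, one filter per channel
    """
    c, n = volume.shape[0], volume.shape[1]
    k = filters.shape[1]
    offsets = np.arange(k)
    scores = np.zeros((n, n, n))
    for x in range(n):
        for y in range(n):
            for z in range(n):
                # Gather the wrapped k*k*k window around this voxel, all channels.
                idx = np.ix_(np.arange(c), (x + offsets) % n,
                             (y + offsets) % n, (z + offsets) % n)
                scores[x, y, z] = np.sum(volume[idx] * filters)
    return scores

def fft_scores(volume, filters):
    """Hypothetical "FFT" baseline: same scores via the Fourier domain."""
    n = volume.shape[1]
    spatial = (n, n, n)
    V = np.fft.rfftn(volume, s=spatial, axes=(1, 2, 3))
    # s=spatial zero-pads each filter to the size of the volume. Caching W
    # across volumes would correspond to the "FFT+caching" variant.
    W = np.fft.rfftn(filters, s=spatial, axes=(1, 2, 3))
    # Component-wise product per channel (conjugate gives correlation),
    # summed over channels in Fourier space so only one inverse FFT is needed.
    summed = np.sum(V * np.conj(W), axis=0)
    return np.fft.irfftn(summed, s=spatial, axes=(0, 1, 2))

rng = np.random.default_rng(0)
vol = rng.standard_normal((3, 6, 6, 6))
flt = rng.standard_normal((3, 3, 3, 3))
print(np.allclose(sliding_window_scores(vol, flt), fft_scores(vol, flt)))
```

The sliding-window version needs almost no extra memory but loops over every voxel, while the FFT version holds a padded, volume-sized spectrum for every channel's filter, which is the speed-versus-memory trade-off the text describes.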


Similar resources

A Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images

Nowadays, ground vehicle monitoring (GVM) is one of the areas of application in the intelligent traffic control system using image processing methods. In this context, the use of unmanned aerial vehicles based on thermal infrared (UAV-TIR) images is one of the optimal options for GVM due to the suitable spatial resolution, cost-effective and low volume of images. The methods that have been prop...


Exploring Context with Deep Structured models for Semantic Segmentation

We propose an approach for exploiting contextual information in semantic image segmentation, and particularly investigate the use of patch-patch context and patch-background context in deep CNNs. We formulate deep structured models by combining CNNs and Conditional Random Fields (CRFs) for learning the patch-patch context between image regions. Specifically, we formulate CNN-based pairwise pote...


Rethinking Atrous Convolution for Semantic Image Segmentation

In this work, we revisit atrous convolution, a powerful tool to explicitly adjust filter’s field-of-view as well as control the resolution of feature responses computed by Deep Convolutional Neural Networks, in the application of semantic image segmentation. To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in paralle...


Segmentation of volume images using a multiscale transform

This paper presents a new method for multiscale segmentation of volume images. The segmentation is achieved using a recent nonlinear transform which leads to well-characterized regions at different spatial and intensity scales. The detected three-dimensional regions are closed and are homogeneous relative to their surround. A pyramid is generated containing the region information extract...


Integrating neural networks with image pyramids to learn target context

Abstract: The utility of combining neural networks with pyramid representations for target detection in aerial imagery is explored. First, it is shown that a neural network constructed using relatively simple pyramid features is a more effective detector, in terms of its sensitivity, than a network which utilizes more complex object-tuned features. Next, an architecture that supports coarse-t...



Publication date: 2013